Creators/Authors contains: "Strout, Michelle"

  1. Dependence between iterations in sparse computations causes inefficient use of memory and computation resources. This paper proposes sparse fusion, a technique that generates efficient parallel code for the combination of two sparse matrix kernels, where at least one of the kernels has loop-carried dependencies. Existing implementations optimize individual sparse kernels separately. However, this approach leads to synchronization overheads and load imbalance due to the irregular dependence patterns of sparse kernels, as well as inefficient cache usage due to their irregular memory access patterns. Sparse fusion uses a novel inspection strategy and code transformation to generate parallel fused code optimized for data locality and load balance. Sparse fusion outperforms the best of unfused implementations using ParSy and MKL by an average of 4.2× and is faster than the best of fused implementations using existing scheduling algorithms, such as LBC, DAGP, and wavefront by an average of 4× for various kernel combinations. 
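     A minimal sequential sketch of the fusion idea (an illustration only, not the paper's inspector-based implementation): a CSR lower-triangular solve L x = b, which carries loop dependences, fused at the row level with a CSR SpMV y = A x. The argument names and the assumptions that A is also lower triangular (so row i of the SpMV reads only x[0..i]) and that each row of L stores its diagonal entry last are mine, not the paper's.

        # Illustrative sketch: per-row fusion of a sparse triangular solve
        # (loop-carried dependences) with an SpMV. Fusing the two row loops
        # reuses x[i] while it is still in cache instead of making a second
        # pass over x, which is the locality benefit sparse fusion targets.
        def fused_trsv_spmv(n, Lp, Lj, Lx, Ap, Aj, Ax, b):
            """L and A in CSR form: rowptr (p), colidx (j), values (x)."""
            x = [0.0] * n
            y = [0.0] * n
            for i in range(n):
                # Kernel 1: forward substitution; diagonal stored last per row.
                s = b[i]
                for k in range(Lp[i], Lp[i + 1] - 1):
                    s -= Lx[k] * x[Lj[k]]
                x[i] = s / Lx[Lp[i + 1] - 1]
                # Kernel 2: row i of y = A x, legal at this point because A is
                # assumed lower triangular, so only x[0..i] is read.
                y[i] = sum(Ax[k] * x[Aj[k]] for k in range(Ap[i], Ap[i + 1]))
            return x, y

        # 2x2 example: L = [[2,0],[1,1]], A = [[3,0],[4,5]], b = [2,3]
        print(fused_trsv_spmv(2, [0, 1, 3], [0, 0, 1], [2.0, 1.0, 1.0],
                              [0, 1, 3], [0, 0, 1], [3.0, 4.0, 5.0], [2.0, 3.0]))
        # -> ([1.0, 2.0], [3.0, 14.0])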
  2. This article presents a code generator for sparse tensor contraction computations. It leverages a mathematical representation of loop nest computations in the Sparse Polyhedral Framework (SPF), which extends the polyhedral model to support non-affine computations, such as those that arise in sparse tensors. SPF is extended to perform layout specification, optimization, and code generation of sparse tensor code: (1) we develop a polyhedral layout specification that decouples the iteration spaces for layout and computation; and (2) we develop efficient co-iteration of sparse tensors by combining polyhedra scanning over the layout of one sparse tensor with code, synthesized through an SMT solver, that finds the corresponding elements in the other tensors. We compare the generated code with that produced by a state-of-the-art tensor compiler, TACO. We achieve on average 1.63× faster parallel performance than TACO on sparse-sparse co-iteration and describe how to improve that to a 2.72× average speedup by switching find algorithms. We also demonstrate that decoupling the iteration spaces of layout and computation enables additional layout and computation combinations to be supported. 
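     A minimal sketch of the scan-one, find-in-the-other co-iteration strategy (hand-written for illustration; the paper synthesizes the find code with an SMT solver, and the function name here is invented): a dot product of two compressed sparse vectors that scans the layout of the first and binary-searches for matching coordinates in the second.

        from bisect import bisect_left

        def sparse_dot(a_idx, a_val, b_idx, b_val):
            """a_idx/b_idx are sorted coordinate lists; a_val/b_val the values."""
            total = 0.0
            for k, i in enumerate(a_idx):      # scan the layout of a
                pos = bisect_left(b_idx, i)    # "find" the coordinate in b
                if pos < len(b_idx) and b_idx[pos] == i:
                    total += a_val[k] * b_val[pos]
            return total

        print(sparse_dot([0, 3, 7], [1.0, 2.0, 3.0], [3, 5, 7], [4.0, 5.0, 6.0]))
        # -> 26.0 (matches at coordinates 3 and 7: 2*4 + 3*6)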
  3. Many scientific applications compute using sparse data and store that data in a variety of sparse formats because each format has unique space and performance benefits. Optimizing applications that use sparse data involves translating the sparse data into the chosen format and transforming the computation to iterate over that format. This paper presents a formal definition of sparse tensor formats and an automated approach to synthesize the transformation between formats. This approach is unique in that it supports ordering constraints not supported by other approaches and synthesizes the transformation code in a high-level intermediate representation suitable for applying composable transformations such as loop fusion and temporary storage reduction. We demonstrate that the synthesized code for COO to CSR with optimizations is 2.85× faster than TACO, Intel MKL, and SPARSKIT, while the more complex COO to DIA is 1.4× slower than TACO but faster than SPARSKIT and Intel MKL, using the geometric mean of execution times. 
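     For reference, a hand-written version of the COO-to-CSR transformation this work synthesizes automatically (the classic histogram-plus-prefix-sum approach; an illustration, not the synthesized code):

        def coo_to_csr(nrows, rows, cols, vals):
            rowptr = [0] * (nrows + 1)
            for r in rows:                  # histogram: nonzeros per row
                rowptr[r + 1] += 1
            for i in range(nrows):          # prefix sum -> row start offsets
                rowptr[i + 1] += rowptr[i]
            colidx = [0] * len(vals)
            values = [0.0] * len(vals)
            next_slot = rowptr[:-1]         # slice copies: next free slot per row
            for r, c, v in zip(rows, cols, vals):
                k = next_slot[r]
                colidx[k], values[k] = c, v
                next_slot[r] += 1
            return rowptr, colidx, values

        print(coo_to_csr(3, [2, 0, 2], [1, 0, 2], [9.0, 1.0, 5.0]))
        # -> ([0, 1, 1, 3], [0, 1, 2], [1.0, 9.0, 5.0])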
  4. Many scientific applications compute on sparse data and use a variety of sparse formats because each format has unique space and performance benefits. Optimizing applications that use sparse data involves translating the sparse data into the chosen format and transforming the computation to iterate over that format. This paper presents a formal definition of sparse tensor formats and an automated approach to synthesize the transformation between formats. This approach is unique in that it supports ordering constraints not supported by other approaches and synthesizes the transformation code in a high-level intermediate representation suitable for applying composable transformations such as loop fusion and temporary storage reduction. We demonstrate that the synthesized code for COO to CSR with optimizations is 3.4× faster than TACO, Intel MKL, and SPARSKIT, while the more complex COO to DIA is slower than TACO but competitive with Intel MKL and SPARSKIT. 
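     And a hand-written sketch of the more complex COO-to-DIA direction (again an illustration, not the synthesized code): DIA stores one dense array per occupied diagonal, indexed by the offset k = col - row.

        def coo_to_dia(nrows, rows, cols, vals):
            offsets = sorted({c - r for r, c in zip(rows, cols)})
            lane = {k: d for d, k in enumerate(offsets)}
            # data[d][r] holds A[r][r + offsets[d]]; 0.0 where the diagonal
            # falls outside the matrix or the entry is structurally zero.
            data = [[0.0] * nrows for _ in offsets]
            for r, c, v in zip(rows, cols, vals):
                data[lane[c - r]][r] = v
            return offsets, data

        print(coo_to_dia(3, [0, 1, 2, 0], [0, 1, 2, 1], [1.0, 2.0, 3.0, 4.0]))
        # -> ([0, 1], [[1.0, 2.0, 3.0], [4.0, 0.0, 0.0]])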
  5. Many important applications, including machine learning, molecular dynamics, and computational fluid dynamics, use sparse data. Processing sparse data leads to non-affine loop bounds and frustrates the use of the polyhedral model for code transformation. The Sparse Polyhedral Framework (SPF) addresses limitations of the polyhedral model by supporting non-affine constraints in sets and relations using uninterpreted functions. This work contributes an object-oriented API that wraps the SPF intermediate representation (IR) and integrates the Inspector/Executor Generation Library and Omega+ for precise set and relation manipulation and code generation. The result is a well-specified definition of a full computation using the SPF IR. The API provides a single entry point for tools to interact with the SPF, generate and manipulate polyhedral data flow graphs, and transform sparse applications. 
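     A toy sketch of what such an API can look like (the class and attribute names below are invented for illustration and are not the actual SPF API): a computation is specified as a set of statements, each with an iteration space in which uninterpreted functions such as rowptr and col stand in for the index arrays that make the loop bounds non-affine.

        class Statement:
            """One statement plus its (possibly non-affine) iteration space."""
            def __init__(self, stmt, iter_space, reads, writes):
                self.stmt = stmt              # source-level statement text
                self.iter_space = iter_space  # set using uninterpreted functions
                self.reads = reads            # data spaces read
                self.writes = writes          # data spaces written

        class Computation:
            """Toy stand-in for an IR's single entry point."""
            def __init__(self):
                self.statements = []
            def add_statement(self, s):
                self.statements.append(s)

        # CSR SpMV, y[i] += val[k] * x[col[k]], over the non-affine space
        # {[i,k] : 0 <= i < n and rowptr(i) <= k < rowptr(i+1)}.
        spmv = Computation()
        spmv.add_statement(Statement(
            stmt="y[i] += val[k] * x[col[k]];",
            iter_space="{[i,k]: 0 <= i < n && rowptr(i) <= k < rowptr(i+1)}",
            reads=["val", "x", "col", "rowptr"],
            writes=["y"],
        ))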